chore(ai): Add check-code-attribution Claude Code skill by 0xadam-brown · Pull Request #5401 · getsentry/sentry-java

0xadam-brown · 2026-05-11T11:49:21Z

📜 Description

Adds a check-code-attribution skill that verifies license headers + THIRD_PARTY_NOTICES.md entries for code copied or adapted from third parties. Reports any invalid headers and entries from the branch diff (and only the branch diff), along with suggestions for their correction.

See screenshot + manual test outputs below for examples.

Implementation notes

Uses a cheap-to-execute and easy-to-update script to detect attribution candidates (see find-attribution-candidates.sh), especially since we eventually want to run this on CI.
Relies on the LLM for fuzzy tasks (viz, determining the adequacy of attribution candidates + generating suggestions).
Includes a shell-based test harness + test suite (see test-find-attribution-candidates.sh; executed manually).
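As a rough illustration of the script-first, LLM-second split, candidate detection can be as small as a marker grep over the changed files. This is a sketch, not the contents of find-attribution-candidates.sh: the marker list reuses phrases the script defines (its `LICENSE_NAMES` list is omitted here), and `find_candidates` is a hypothetical name.

```bash
# Phrases that suggest a file carries third-party attribution.
# Taken from the script's VENDORING_MARKERS plus "copyright" and "licensed".
MARKERS='adapted from|backported from|copied from|derived from|ported from|translated from|vendored|copyright|licensed'

# Reads file paths on stdin (one per line) and prints those whose
# contents match a marker, case-insensitively.
find_candidates() {
  local path
  while IFS= read -r path; do
    [ -f "$path" ] || continue        # skip deleted or missing paths
    if grep -qiE "$MARKERS" "$path"; then
      printf '%s\n' "$path"
    fi
  done
}
```

Feeding it the branch diff might look like `git diff --name-only main...HEAD | find_candidates`, assuming `main` is the base branch. Only the files it prints would then need to go to the LLM.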

Co-authored-by: Claude Opus 4.6 (1M context) noreply@anthropic.com

💡 Motivation and Context

Third-party code attribution is a legal and compliance requirement. Currently, attribution correctness is only caught during manual code review. This skill automates detection of vendored code in branch diffs and can help us flag missing or incomplete attributions before a PR is merged.

Background

Sentry SDKs and third-party code

Third-party code can enter Sentry’s SDKs (including sentry-java) in three ways:

1. Plain vanilla dependencies
2. Shaded code
3. Vendored code

All third-party code must be properly attributed, and licenses must be compatible with Sentry’s licensing policies.

Plain deps + shaded code: We run an enforce-license-compliance GitHub workflow that applies a FOSSA check to all plain vanilla dependencies and our few shaded dependencies, which ensures their licenses are properly attributed and are compatible with Sentry’s licensing policies.
Vendored code: Relies on a manual process where developers add attributions to files containing vendored code + include a corresponding entry in the THIRD_PARTY_NOTICES.md file that ships with the SDK. Developers are also responsible for ensuring license compatibility.

The criteria for what counts as a proper attribution of vendored code live in the CODE_ATTRIBUTION_CRITERIA.md file under the heading “Third-Party Code Attribution”.
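To make the header/NOTICES pairing concrete, here is a minimal sketch that checks whether the repository named in a file's header also appears in the notices file. It is an assumption-laden illustration: the function names are hypothetical, only GitHub URLs are recognized, and only the first 40 lines of the file are scanned.

```bash
# Extracts the first GitHub repository URL (https://github.com/<owner>/<repo>)
# from text on stdin.
repo_url() {
  grep -oE 'https://github\.com/[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+' | head -n 1
}

# has_notices_entry <source-file> <notices-file>
# Succeeds if the notices file mentions the repository referenced near
# the top of the source file.
has_notices_entry() {
  local url
  url="$(head -n 40 "$1" | repo_url)"
  [ -n "$url" ] && grep -qF "$url" "$2"
}
```

Because the match is a plain substring test, `https://github.com/square/tape` would also match a hypothetical `.../square/tape-extras` entry, so a hit is a hint rather than proof.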

Goal of this PR: Create a skill that helps us properly attribute vendored code

Types of vendored code:

1. Vendored code that’s already properly attributed.
2. Vendored code that has an attribution, but it’s incomplete or doesn’t otherwise conform to the criteria from CODE_ATTRIBUTION_CRITERIA.md.
3. Vendored code that has no attribution / no indication that it’s vendored.

The skill introduced in this PR protects (1) from regression and identifies instances of (2). (Addressing (3) is out of scope – and is obviously non-trivial.)

addresses: JAVA-499

⚠️ Callouts

Skill does not mandate that license headers exactly match the template from CODE_ATTRIBUTION_CRITERIA.md so long as all template fields are present.

That^^ lets us maintain our current, diverse header formats and remain relatively unopinionated going forward. Let me know if you think we should be strict about things, and I can update.
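A format-agnostic check of this kind looks only for the presence of each field, not for a fixed layout. The sketch below is illustrative: in the skill itself this judgment is left to the LLM, the field list is inferred from the sample outputs in this PR rather than from CODE_ATTRIBUTION_CRITERIA.md, and `missing_header_fields` is a hypothetical name.

```bash
# Reads a header comment on stdin and prints the name of each field
# that appears to be missing, one per line. Layout and comment style
# are ignored; only the presence of each field matters.
missing_header_fields() {
  local header
  header="$(cat)"
  printf '%s\n' "$header" | grep -qiE 'adapted from|copied from|derived from|ported from' \
    || echo 'origin statement'
  printf '%s\n' "$header" | grep -qi 'copyright' \
    || echo 'copyright'
  printf '%s\n' "$header" | grep -qiE 'licen[sc]e' \
    || echo 'license'
  printf '%s\n' "$header" | grep -qE 'https?://' \
    || echo 'source URL'
}
```

Run against the header from Diff 8 below, this reports `copyright` and `license`; against the header from Diff 10, it reports only `origin statement`.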

Screenshot of output

Example 1

Example 2

On the fence about displaying false positives (included to be conservative, as they helped me identify LLM quirks as I developed; they should be rare regardless). Let me know if you disagree. Happy to update.

💚 How did you test it?

Created the test suite (test-find-attribution-candidates.sh), which covers candidate detection, rename tracking, NOTICES entry matching, deleted files, generated code exclusions, and edge cases.
Manually invoked the skill on branches with known attribution issues to verify correct detection and reporting. (See output below.)
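For readers unfamiliar with shell-based test harnesses, the core can be very small. The helpers below are a hypothetical sketch of the general shape, not the ones in test-find-attribution-candidates.sh.

```bash
TESTS_RUN=0
TESTS_FAILED=0

# assert_contains <description> <haystack> <needle>
assert_contains() {
  TESTS_RUN=$((TESTS_RUN + 1))
  case "$2" in
    *"$3"*) printf 'ok   - %s\n' "$1" ;;
    *)      TESTS_FAILED=$((TESTS_FAILED + 1)); printf 'FAIL - %s\n' "$1" ;;
  esac
}

# assert_not_contains <description> <haystack> <needle>
assert_not_contains() {
  TESTS_RUN=$((TESTS_RUN + 1))
  case "$2" in
    *"$3"*) TESTS_FAILED=$((TESTS_FAILED + 1)); printf 'FAIL - %s\n' "$1" ;;
    *)      printf 'ok   - %s\n' "$1" ;;
  esac
}

# Prints a summary and returns non-zero if any assertion failed.
report() {
  printf '%d run, %d failed\n' "$TESTS_RUN" "$TESTS_FAILED"
  [ "$TESTS_FAILED" -eq 0 ]
}
```

Each test case would capture the script's output for a prepared fixture repository and pass it to one of the assertions, with `report` called once at the end.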

Manual tests + output

Note the skill's output format has changed since these tests were run, but the behavior otherwise remains the same.

Diff 1: Remove entire license header

diff --git a/sentry-android-core/src/main/java/io/sentry/android/core/ANRWatchDog.java b/sentry-android-core/src/main/java/io/sentry/android/core/ANRWatchDog.java
index b726dd0c8..0c11522c1 100644
--- a/sentry-android-core/src/main/java/io/sentry/android/core/ANRWatchDog.java
+++ b/sentry-android-core/src/main/java/io/sentry/android/core/ANRWatchDog.java
@@ -1,27 +1,4 @@
-/*
- * Adapted from https://github.com/SalomonBrys/ANR-WatchDog/blob/1969075f75f5980e9000eaffbaa13b0daf282dcb/anr-watchdog/src/main/java/com/github/anrwatchdog/ANRWatchDog.java
- *
- * The MIT License (MIT)
- *
- * Copyright (c) 2016 Salomon BRYS
- *
- * Permission is hereby granted, free of charge, to any person obtaining a copy of
- * this software and associated documentation files (the "Software"), to deal in
- * the Software without restriction, including without limitation the rights to
- * use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
- * the Software, and to permit persons to whom the Software is furnished to do so,
- * subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in all
- * copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
- * FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
- * COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
- * IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
- * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
- */
+/* ANRWatchDog implementation */

Output 1

⚠️ File: io.sentry.android.core.ANRWatchDog
Required attribution field(s) removed:
- The entire MIT license header (ANR-WatchDog by Salomon BRYS) was stripped and replaced with a generic comment. Restore the original attribution header.

Diff 2: Modify existing license header, but retain all required fields

diff --git a/sentry/src/main/java/io/sentry/CircularFifoQueue.java b/sentry/src/main/java/io/sentry/CircularFifoQueue.java
index 8fa72e39d..f0d7d6084 100644
--- a/sentry/src/main/java/io/sentry/CircularFifoQueue.java
+++ b/sentry/src/main/java/io/sentry/CircularFifoQueue.java
@@ -1,20 +1,12 @@
 /*
  * Adapted from https://github.com/apache/commons-collections/blob/fce46cdcc6fa33ba9472921d4b3ec3f548d8cbcc/src/main/java/org/apache/commons/collections4/queue/CircularFifoQueue.java
  *
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
+ * Copyright 2025 The Apache Software Foundation.
+ * Licensed under the Apache License, Version 2.0
  * (the "License"); you may not use this file except in compliance with
  * the License.  You may obtain a copy of the License at
  *
  *      http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
  */

 package io.sentry;

Output 2

👀 File: io.sentry.CircularFifoQueue
Vendored code detected (Apache Commons Collections) – verify that THIRD_PARTY_NOTICES.md reflects your updates.

Diff 3: Modify existing license header by removing one or more required fields

diff --git a/sentry/src/main/java/io/sentry/transport/ReusableCountLatch.java b/sentry/src/main/java/io/sentry/transport/ReusableCountLatch.java
index cfa0b68a6..fb7c4171e 100644
--- a/sentry/src/main/java/io/sentry/transport/ReusableCountLatch.java
+++ b/sentry/src/main/java/io/sentry/transport/ReusableCountLatch.java
@@ -1,8 +1,6 @@
 /*
  * Adapted from https://github.com/MatejTymes/JavaFixes/blob/37e74b9d0a29f7a47485c6d1bb1307f01fb93634/src/main/java/javafixes/concurrency/ReusableCountLatch.java
  *
- * Copyright (C) 2016 Matej Tymes
- *
  * Licensed under the Apache License, Version 2.0 (the "License");
  * you may not use this file except in compliance with the License.
  * You may obtain a copy of the License at
@@ -14,8 +12,6 @@
  * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  * See the License for the specific language governing permissions and
  * limitations under the License.
- *
- * Additional Sentry modifications: Updated synchronizer for virtual thread compatibility.
  */

 package io.sentry.transport;

Output 3

⚠️ File: io.sentry.transport.ReusableCountLatch
Required attribution field(s) removed:
- The copyright line Copyright (C) 2016 Matej Tymes was removed from the license header. Please restore it.

Diff 4: Leave existing license header unchanged, but make an inconsistent modification to THIRD_PARTY_NOTICES.md entry

diff --git a/THIRD_PARTY_NOTICES.md b/THIRD_PARTY_NOTICES.md
index 5a48d567f..08b6cd78a 100644
--- a/THIRD_PARTY_NOTICES.md
+++ b/THIRD_PARTY_NOTICES.md
@@ -94,42 +94,14 @@ limitations under the License.

 **Source:** https://github.com/square/tape (Commit: 445cd3fd0a7b3ec48c9ea3e0e86663fe6d3735d8)<br>
 **License:** Apache License 2.0<br>
-**Copyright:** Copyright (C) 2010 Square, Inc.
+**Copyright:** Copyright (C) 2015 Square, Inc.

Output 4

⚠️ NOTICES entry modified: Square — Tape (Apache 2.0)
Entry metadata inconsistent with source file headers:
- Copyright year changed to 2015 in THIRD_PARTY_NOTICES.md, but the source files (QueueFile.java, FileObjectQueue.java, ObjectQueue.java) all still say "Copyright (C) 2010
  Square, Inc."

Diff 5: Leave existing license header unchanged, but remove THIRD_PARTY_NOTICES.md entry

diff --git a/THIRD_PARTY_NOTICES.md b/THIRD_PARTY_NOTICES.md
index 5a48d567f..08b6cd78a 100644
--- a/THIRD_PARTY_NOTICES.md
+++ b/THIRD_PARTY_NOTICES.md
@@ -94,42 +94,14 @@ limitations under the License.
-
-## Square — Seismic (Apache 2.0)
-
-**Source:** https://github.com/square/seismic<br>
-**License:** Apache License 2.0<br>
-**Copyright:** Copyright 2010 Square, Inc.
-
-### Scope
-
-The Sentry Java SDK includes an adapted version of Square's Seismic shake detection algorithm. The rolling sample window approach and `SampleQueue`/`SamplePool` data structures in `io.sentry.android.core.SentryShakeDetector` are based on Seismic's `ShakeDetector`.

Output 5

⚠️ NOTICES entry removed: Square — Seismic (Apache 2.0)
Source file(s) still reference this library:
- io.sentry.android.core.SentryShakeDetector still contains attribution header for Square's Seismic. Either restore the THIRD_PARTY_NOTICES.md entry or remove the vendored code.

Diff 6: Add newly-vendored code with valid license header and THIRD_PARTY_NOTICES.md entry

diff --git a/sentry/src/main/java/io/sentry/util/SlidingWindow.java b/sentry/src/main/java/io/sentry/util/SlidingWindow.java
new file mode 100644
index 000000000..936aa0687
--- /dev/null
+++ b/sentry/src/main/java/io/sentry/util/SlidingWindow.java
@@ -0,0 +1,42 @@
+// Adapted from Metrics-Java SlidingWindowReservoir.
+// Copyright 2010-2023 Coda Hale and Yammer, Inc.
+// Licensed under the Apache License, Version 2.0.
+// https://github.com/dropwizard/metrics/blob/main/metrics-core/src/main/java/com/codahale/metrics/SlidingWindowReservoir.java
+package io.sentry.util;
+
+import java.util.concurrent.atomic.AtomicLong;
+
+public final class SlidingWindow<T> {

Output 6

👀 File: io.sentry.util.SlidingWindow
Vendored code detected (Dropwizard Metrics SlidingWindowReservoir) – verify that THIRD_PARTY_NOTICES.md reflects your updates.

Diff 7: Add newly-vendored code with valid license header but no THIRD_PARTY_NOTICES.md entry

diff --git a/sentry/src/main/java/io/sentry/util/ConcurrentLruCache.java b/sentry/src/main/java/io/sentry/util/ConcurrentLruCache.java
new file mode 100644
index 000000000..330b92794
--- /dev/null
+++ b/sentry/src/main/java/io/sentry/util/ConcurrentLruCache.java
@@ -0,0 +1,36 @@
+// Adapted from Caffeine Cache.
+// Copyright 2024 Ben Manes.
+// Licensed under the Apache License 2.0.
+// https://github.com/ben-manes/caffeine/blob/master/caffeine/src/main/java/com/github/benmanes/caffeine/cache/BoundedLocalCache.java
+package io.sentry.util;
+
+import java.util.LinkedHashMap;
+import java.util.Map;
+
+/**
+ * A simple thread-safe LRU cache backed by a synchronized LinkedHashMap.
+ */
+public final class ConcurrentLruCache<K, V> {
…

Output 7

⚠️ File: io.sentry.util.ConcurrentLruCache
Vendored code detected (Caffeine Cache) — attribution header is complete.
- No corresponding entry in THIRD_PARTY_NOTICES.md. An entry needs to be added.

Diff 8: Add newly-vendored code with an invalid license header and existing THIRD_PARTY_NOTICES.md entry

diff --git a/sentry/src/main/java/io/sentry/util/LeakyBucket.java b/sentry/src/main/java/io/sentry/util/LeakyBucket.java
new file mode 100644
index 000000000..4bce4f602
--- /dev/null
+++ b/sentry/src/main/java/io/sentry/util/LeakyBucket.java
@@ -0,0 +1,38 @@
+// Adapted from Resilience4j RateLimiter.
+// https://github.com/resilience4j/resilience4j/blob/master/resilience4j-ratelimiter/src/main/java/io/github/resilience4j/ratelimiter/internal/AtomicRateLimiter.java
+package io.sentry.util;
+
+public final class LeakyBucket {

Output 8

⚠️ File: io.sentry.util.LeakyBucket
Vendored code detected (Resilience4j RateLimiter) — missing required fields:
- Header is missing the copyright holder and license name. Add lines for Copyright 2019 Robert Winkler and Bohdan Storozhuk and Licensed under the Apache License, Version 2.0
  (per the existing THIRD_PARTY_NOTICES.md entry).

Diff 9: Add newly-vendored code with an invalid license header and no THIRD_PARTY_NOTICES.md entry

diff --git a/sentry/src/main/java/io/sentry/util/TokenBucket.java b/sentry/src/main/java/io/sentry/util/TokenBucket.java
new file mode 100644
index 000000000..d09f06af9
--- /dev/null
+++ b/sentry/src/main/java/io/sentry/util/TokenBucket.java
@@ -0,0 +1,36 @@
+// Adapted from Guava RateLimiter.
+// https://github.com/google/guava/blob/master/guava/src/com/google/common/util/concurrent/RateLimiter.java
+package io.sentry.util;
+
+/**
+ * A simple token bucket rate limiter.
+ */
+public final class TokenBucket {
…

Output 9

⚠️ File: io.sentry.util.TokenBucket
Vendored code detected (Guava RateLimiter) — missing required fields:
- Header is missing copyright year, copyright holder, and license name. Only the source URL and library name are present.
- No corresponding entry in THIRD_PARTY_NOTICES.md. An entry needs to be added for Guava RateLimiter.

Diff 10: Add newly-vendored code with an invalid license header, no THIRD_PARTY_NOTICES.md entry, and a new license type

diff --git a/sentry/src/main/java/io/sentry/util/CompactJsonWriter.java b/sentry/src/main/java/io/sentry/util/CompactJsonWriter.java
new file mode 100644
index 000000000..d9c9a342e
--- /dev/null
+++ b/sentry/src/main/java/io/sentry/util/CompactJsonWriter.java
@@ -0,0 +1,60 @@
+/*
+ * Copyright 2024 Example JSON Project Contributors.
+ * SPDX-License-Identifier: BSD-3-Clause
+ * https://github.com/example-json/compact-writer
+ */
+package io.sentry.util;
+
+import java.io.IOException;
+import java.io.Writer;
+
+/**
+ * A lightweight JSON writer that produces compact (no whitespace) output.
+ */
+public final class CompactJsonWriter {
…

Output 10

⚠️ File: io.sentry.util.CompactJsonWriter
Vendored code detected (compact-writer) — missing required fields:
- Header is missing an "Adapted from" statement identifying the library name. Copyright, license (BSD-3-Clause), and source URL are present.
- No corresponding entry in THIRD_PARTY_NOTICES.md. An entry needs to be added.
- ❗This license type (BSD-3-Clause) is not yet represented in THIRD_PARTY_NOTICES.md. Please verify it is compatible with Sentry's licensing policies:
  https://open.sentry.io/licensing/.

Diff 11: False positive

diff --git a/THIRD_PARTY_NOTICES.md b/THIRD_PARTY_NOTICES.md
index 5a48d567f..57c0cc359 100644
--- a/THIRD_PARTY_NOTICES.md
+++ b/THIRD_PARTY_NOTICES.md
@@ -484,3 +484,81 @@ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 SOFTWARE.
+
+---
+
+## Eclipse Collections — CircularArrayList (EPL 2.0)
+
+**Source:** https://github.com/eclipse/eclipse-collections/blob/master/eclipse-collections/src/main/java/org/eclipse/collections/impl/list/mutable/CircularArrayList.java<br>
+**License:** Eclipse Public License 2.0<br>
+**Copyright:** Copyright (c) 2022 Goldman Sachs and others
+
+### Scope
+
+The Sentry Java SDK includes an adapted circular buffer implementation from Eclipse Collections. The code resides in io.sentry.util.CircularBuffer.
+
+Copyright (c) 2022 Goldman Sachs and others.
+
+This program and the accompanying materials are made available under the
+terms of the Eclipse Public License 2.0 which is available at
+http://www.eclipse.org/legal/epl-2.0.
+
+SPDX-License-Identifier: EPL-2.0

Output 11 (numbered as “1.” because it’s the first entry in the False Positives section)

THIRD_PARTY_NOTICES.md — Flagged because attribution markers were added to it. This is expected; the file is the notices file itself.

📝 Checklist

I added GH Issue ID & Linear ID
I added tests to verify the changes.
No new PII added or SDK only sends newly added PII if sendDefaultPII is enabled.
I updated the docs if needed.
I updated the wizard if needed.
Review from the native team if needed.
No breaking change or entry added to the changelog.
No breaking change for hybrid SDKs or communicated to hybrid SDKs.

🔮 Next steps

Once folks are fine with the skill proper, I'll post a second PR adding it to the create-java-pr task, and a third PR introducing a GitHub Actions workflow so we can run it on CI.

#skip-changelog

linear-code · 2026-05-11T11:49:25Z

JAVA-499

sentry · 2026-05-11T11:53:54Z

📲 Install Builds

Android

🔗 App Name	App ID	Version	Configuration
SDK Size	io.sentry.tests.size	8.41.0 (1)	release

⚙️ sentry-android Build Distribution Settings

runningcode · 2026-05-11T13:24:00Z

+
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"


Can we use license gradle plugin or another license checking tool to find/enforce this?
For example: https://github.com/hierynomus/license-gradle-plugin

Ah, great question. But I don't think so b/c this PR is about validating attributions for vendored / copied / adapted code, rather than dependencies declared in our Gradle scripts.

My understanding is libraries like license-gradle-plugin only address the latter, and we already have the enforce-license-compliance workflow that does that sort of thing.

(After you posted your comment, I added a "Background" dropdown under "Motivation and Context" in the PR description, which captures my current understanding. Could be mistaken, of course, so correct me if I goofed or misread your question.)

Yeah I think it will only help with the header part of the problem. The license-gradle-plugin scans files to make sure they have a license header. You can give specific excludes and includes paths as well as the header format.

It doesn't help with the THIRD_PARTY_NOTICES though.

Gotcha – and thx for pointing me in the right direction. I took a look at the license-gradle-plugin and had a chat with Claude about it. It's in the dropdown.

tl;dr: we could consider using a library to expand our skill's functionality by generating canonical headers in cases where we've already detected an invalid header. But there doesn't seem to be a good fit for detecting invalid headers to begin with – and, as you mentioned, no library will help us keep the headers and THIRD_PARTY_NOTICES.md in lockstep.

That^^ suggests sticking with the current approach as a base (viz, detection + informing the developer of defects) and thinking about whether canonical header generation is worth the extra investment. I plan to extend the check-code-attribution skill so it can be run on CI in a follow-on PR. My vote would be to delay a deeper dive until then, but let me know if you think otherwise 👍.

Claude convo

Me:

What do you think about this comment: #5401 (comment)? Is it feasible to use the library runningcode mentions?

Claude:

The license-gradle-plugin is actually not a good fit, even for the header part:

Diverse, non-uniform headers. The plugin is designed to enforce a single, uniform header across your codebase (e.g., "all .java files must have the Apache 2.0 Sentry copyright header"). Your attribution headers are different per file — each vendored file has its own library name, copyright holder, license type, and source URL. The plugin can't check for that.

Detection is the hard problem. The core value of the skill is identifying which files are vendored/adapted in the first place (via heuristics + LLM judgment). license-gradle-plugin has no concept of this — it just checks/stamps headers on files you explicitly include.

It would fight with your existing first-party headers. Line 115 of the script already notes that many first-party files carry the project's own license header. The plugin would struggle to distinguish "this file needs a third-party attribution header" from "this file correctly has our own header."

Only covers Java/Kotlin. Your skill also scans shell scripts, config files, etc. — the Gradle plugin only operates within Gradle's source sets.

Me:

Taking a look at the License Maven Plugin that underlies license-gradle-plugin (https://mathieu.carbou.me/license-maven-plugin/configuration/), could we use it (either the maven or gradle version) to generate attribution headers once we've identified a candidate attribution and determined it's invalid?

Ie, we use the find-attribution-candidates.sh to locate candidate attributions; we ask the LLM to identify invalid attributions among that group; and then we have the LLM add the invalid attribution's file to the maven/gradle plugin set for that license type. Finally, the skill runs the plugin to generate the attribution(s) as defined by the maven/gradle plugin and replace the invalid attribution(s).

Thoughts?

Claude:

Good creative thinking, but I don't think it works well in practice. Here's why:

The Gradle plugin can't do it. Since sentry-java is a Gradle project, license-gradle-plugin is the natural choice — but it only supports one header template with one set of properties globally. No licenseSets, no per-file properties. You'd need multiple custom tasks, one per vendored file.

The Maven plugin can, but... It supports licenseSets (multiple header templates, each with their own includes/properties) and even headerSections for finer-grained substitution. But integrating a Maven plugin into a Gradle build is awkward — you'd be shelling out to mvn or pulling in a Maven executor.

The deeper issue: the config is the attribution. To set up a licenseSet for a file, the LLM needs to know the library name, copyright holder, license type, and source URL — the exact same information that goes into the 4-line header comment. Writing a licenseSet XML/Gradle block is more work than just writing the header directly into the file. You'd essentially be duplicating the attribution in two places (build config + file header), creating a maintenance burden.

Where a plugin shines vs. where it doesn't:

Plugin excels: stamping 500 files with the same Apache 2.0 Sentry copyright header

Plugin struggles: stamping 10 files with 10 different headers (different library, copyright holder, license, URL each time)

Our problem is the second case. The LLM already has the information and can write the 4-line comment directly — adding a plugin intermediary just adds indirection and config maintenance for no quality gain.

I think your current approach (LLM identifies the problem, LLM fixes the header directly) is actually the right one. The plugin would be solving a problem we don't have.

Me:

Gtk. Are there other libraries besides the mvn license plugin and the https://github.com/hierynomus/license-gradle-plugin that could do either what runningcode wants or what I just suggested?

Claude:

Here's what's out there:

Best fit: REUSE tool (fsfe/reuse-tool)

Python CLI + GitHub Action. Purpose-built for exactly the vendored code scenario. You declare per-file/per-glob license and copyright in a REUSE.toml, and it can both lint (reuse lint) and stamp headers (reuse annotate). It's opinionated about SPDX format, but that's arguably a plus for standardization. Not Gradle-native though.

Runner-up: Apache SkyWalking Eyes (license-eye)

Go CLI + GitHub Action. Supports multiple header sections in .licenserc.yaml, each with different SPDX IDs and copyright owners mapped to different path globs. Can both check and fix. Closest to a "configure once, enforce in CI" solution with per-file customization.

Gradle-native options

Yumi Gradle Licenser (dev.yumi.gradle.licenser) — Actively maintained, supports multiple license header rules evaluated per-file in definition order. Closest Gradle-native option to what you'd need.

CadixDev Licenser (org.cadixdev.licenser) — Has matching() blocks for different headers on different file patterns, but lightly maintained.

Not a fit

google/addlicense — Single header per invocation, no config file. You'd need to script multiple runs.

johann-petrak/licenseheaders — Same limitation, single template per run.

My take

None of these solve the detection problem (finding which files are vendored). They all assume you already know which files need which headers. So they can't replace find-attribution-candidates.sh + LLM evaluation.

Where they could add value is as a CI enforcement layer: once attributions are established, a tool like REUSE or SkyWalking Eyes could verify in CI that headers haven't drifted. But your skill already does that verification, and it also handles THIRD_PARTY_NOTICES.md, which none of these tools touch.

Net: these tools solve adjacent problems well, but none of them cover the full scope of what the skill does. If you wanted to adopt one, REUSE is the most philosophically aligned — but it would be additive to the skill, not a replacement for any part of it.

Adds a check-code-attribution skill that verifies license headers + THIRD_PARTY_NOTICES.md entries for code copied or adapted from third parties. Reports any invalid headers and entries in the branch diff, along with suggestions for their correction.

Implementation notes:
- Uses a cheap-to-execute and easy-to-update script to detect attribution candidates (see find-attribution-candidates.sh).
- Relies on the LLM for fuzzy tasks (viz, determining the adequacy of candidate attributions + generating suggestions).
- Includes a shell-based test harness with 30 tests covering edge cases (see test-find-attribution-candidates.sh).

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…-candidates script + minor LLM prompt updates

…te reasons + remove the gh api comment-cleanup block as it's a holdover from an earlier CI implementation + minor cleanup.

…bility

markushi

Looks very solid to me, left two comments for discussion, but already in a good shape!

markushi · 2026-05-13T11:16:26Z

+Run the pre-filter and analysis script:
+
+```bash
+bash .claude/skills/check-code-attribution/find-attribution-candidates.sh


I would have kept this one a bit simpler: Do a git diff and leave all output interpretation to the agent. But of course, this would also put all trust into the model. How about a hybrid: Do the git diff and the check for common patterns as part of the script, but leave all output interpretation to the model?

If we are only running on a git diff in the PR and not the entire codebase, why not just use an LLM to check the code instead of using this script?
Or add something to warden to check for attribution candidates?

Happy to defer to folks here.

My original approach used a script less to keep judgment out of the LLM's hands, and more to keep costs down. (We want to run this skill for every sentry-java CI build, very few of which will actually touch license headers or THIRD_PARTY_NOTICES.)

I still think those cost-savings are worth the extra effort of creating our own GitHub Actions workflow rather than going with Warden (which doesn't permit scripts but would otherwise be a great choice). I've posted a version that applies @markushi's "hybrid" approach; feel free to chime in if you have a strong view about the Warden route.
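One way to picture the script side of that hybrid: emit every added or removed diff line that touches an attribution marker, tagged with its file, and leave all interpretation of the hits to the model. The sketch below is an assumption about the general shape, not the posted implementation; `attribution_changes` is a hypothetical name.

```bash
# Reads a unified diff on stdin and prints "<path><TAB><diff line>" for
# each added or removed line that mentions an attribution marker.
attribution_changes() {
  awk -v pat='adapted from|backported from|copied from|derived from|ported from|translated from|vendored|copyright|licensed' '
    # Old-side path; remembered for deleted files.
    /^--- /       { old = substr($0, 5); sub(/^a\//, "", old); next }
    # New-side path; /dev/null means the file was deleted.
    /^[+][+][+] / { file = substr($0, 5); sub(/^b\//, "", file)
                    if (file == "/dev/null") file = old
                    next }
    # Added or removed lines that mention an attribution marker.
    /^[+-]/       { if (tolower($0) ~ pat) print file "\t" $0 }
  '
}
```

Usage might be `git diff main...HEAD | attribution_changes`, assuming `main` is the base branch. An empty result means the model never needs to be invoked, which is where the cost savings would come from. A removed content line that itself begins with `-- ` would be misread as a file header, so this is a heuristic only.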

I'm not a bash expert, so maybe it's just my bias here. LLMs obviously help a lot, but 1400 lines of bash is quite a maintenance burden.

I executed the script locally to test and found that it would match things like CHANGELOG.md and gradlew but not io.sentry.android.core.ApplicationNotResponding.java. That's the opposite of what we want.

If I just ask Claude "Can you check the codebase to find any third party code that isn't attributed in the @THIRD_PARTY_NOTICES.md?", I get a more accurate answer, which includes the above as well as some other pieces of our codebase that are not attributed correctly.
That is why I think running an LLM on the diff with that question is a more accurate and more easily maintainable solution. We already run warden, seer, bugbot and cursor on each PR. We can just add a rule to one of these that checks this.

That's just my 2 cents. Feel free to merge this though.

Removes the following functionality from our find-attribution-candidates script and hands it over to the LLM:

1. URL-based NOTICES matching
2. License type detection
3. "New license type" check
4. Removed/modified NOTICES entry detection

…bution stripping during renames

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.


Reviewed by Cursor Bugbot for commit 3127d5b.

cursor · 2026-05-15T08:17:43Z

+ATTRIBUTION_PATTERN="$VENDORING_MARKERS|copyright|licensed|$LICENSE_NAMES"
+
+# Sentry entity names — copyright lines mentioning these are treated as first-party
+SENTRY_ENTITIES='functional software|getsentry|sentry software'


Fast path bypasses slow path's "sentry" entity handling

Low Severity

SENTRY_ENTITIES ('functional software|getsentry|sentry software') omits bare sentry, but SENTRY_STRIP_SED on line 62 independently strips sentry in the slow path. The fast path at line 122 (grep -qivE "$SENTRY_ENTITIES") short-circuits and returns 0 (third-party) for any copyright line not matching the three entity patterns — including "Copyright 2024 Sentry" (the format used in the project's own LICENSE file). The slow path that correctly handles this case is never reached.

Additional Locations (2)

.claude/skills/check-code-attribution/find-attribution-candidates.sh#L121-L122

.claude/skills/check-code-attribution/find-attribution-candidates.sh#L61-L62
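The gap described in this finding can be reproduced and closed in a few lines. The sketch below is one possible fix under assumptions, not the script's actual fast or slow path: it adds a bare `sentry` alternative to the first-party list and matches on whole words. `is_third_party_copyright` is a hypothetical name.

```bash
# First-party entity names. The first three come from the script's
# SENTRY_ENTITIES; the bare "sentry" alternative is the assumed fix.
FIRST_PARTY='functional software|getsentry|sentry software|sentry'

# is_third_party_copyright <copyright line>
# Succeeds (exit 0) when the line does not name a first-party entity.
is_third_party_copyright() {
  # -w restricts matches to whole words, so "Sentry" matches but a
  # longer word that merely contains "sentry" does not.
  if printf '%s\n' "$1" | grep -qiwE "$FIRST_PARTY"; then
    return 1
  fi
  return 0
}
```

With this list, `Copyright 2024 Sentry` is classified as first-party, while `Copyright (C) 2016 Matej Tymes` is still classified as third-party.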


markushi

Looking pretty good already, left a few smallish comments

markushi · 2026-05-15T08:03:03Z

+/generated/
+
+# Gradle build scripts
+build\.gradle(\.kts)?$


technically those could contain any vendored snippets, right?

markushi · 2026-05-15T08:05:57Z

+# Update these when new attribution patterns or license types are encountered.
+
+# Strong indicators that code was adapted/copied from an external source
+VENDORING_MARKERS='adapted from|backported from|copied from|derived from|ported from|translated from|vendored'


markushi · 2026-05-15T08:12:25Z

+  done < "$EXCLUSIONS_FILE" > "$EXCLUSION_PATTERNS_FILE"
+fi
+
+is_generated_file() {


whilst most of the time it's generated files, i'd generalize this a bit

Suggested change

-is_generated_file() {
+should_exclude_file() {
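A generalized exclusion check along these lines might look as follows. This is a sketch under assumptions: the exclusions file is taken to hold one extended regular expression per line with `#` comments, `load_exclusion_patterns` is a hypothetical helper, and the real function body is not shown in this excerpt.

```bash
# load_exclusion_patterns <exclusions-file>
# Strips comments and blank lines, leaving one pattern per line on stdout.
load_exclusion_patterns() {
  grep -vE '^[[:space:]]*(#|$)' "$1"
}

# should_exclude_file <path> <patterns-file>
# Succeeds if the path matches any pattern in the prepared patterns file.
# An empty patterns file excludes nothing.
should_exclude_file() {
  [ -s "$2" ] && printf '%s\n' "$1" | grep -qE -f "$2"
}
```

Nothing here is specific to generated code, so the same function would cover build scripts, lockfiles, or any other path pattern added to the exclusions file.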

markushi · 2026-05-15T08:17:55Z

+    git ls-files --others --exclude-standard 2>/dev/null | while IFS= read -r path; do
+      [[ -n "$path" ]] && printf 'A\t%s\n' "$path"
+    done
+  } | awk -F'\t' '{


definitely your call here, but a sort -u is maybe "good enough" for improved readability here
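The trade-off between the two approaches is easy to show on `status<TAB>path` lines. In this sketch both function names are hypothetical, and since the awk body is cut off in the excerpt above, `dedupe_by_path` is only a guess at its intent.

```bash
# Drops exact duplicate lines. sort -u compares whole lines, so a path
# listed with two different statuses is kept twice, and input order is lost.
dedupe_status_lines() {
  sort -u
}

# Keeps only the first line seen for each path (field 2), preserving order.
dedupe_by_path() {
  awk -F'\t' '!seen[$2]++'
}
```

If the same path can never appear with two different statuses, the two behave the same apart from ordering, and `sort -u` is the more readable choice.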

0xadam-brown force-pushed the chore/check-code-attribution-skill branch 2 times, most recently from e5ea3f3 to 3a6b60d Compare May 11, 2026 11:51


runningcode reviewed May 11, 2026


0xadam-brown changed the title ~~chore(ai): Add check-code-attribution Claude Code skill~~ chore(ai): Add check-code-attribution Claude Code skill May 11, 2026

0xadam-brown force-pushed the chore/check-code-attribution-skill branch from 3a6b60d to 97e7c65 Compare May 11, 2026 13:44

0xadam-brown marked this pull request as ready for review May 11, 2026 14:25

0xadam-brown requested review from adinauer, markushi and romtsn as code owners May 11, 2026 14:25

cursor Bot reviewed May 11, 2026
